425 research outputs found

    BNC! Handle with care! Spelling and tagging errors in the BNC

    "You loose your no-claims bonus," instead of "You lose your no-claims bonus," is an example of a real-word spelling error. One way to enable a spellchecker to detect such errors is to prime it with information about likely features of the context for "loose" (verb) as compared with "lose". To this end, we extracted all the examples of "loose" used as a verb from the BNC (World edition, text). There were, apparently, 159 occurrences of "loose" (VVB or VVI). However, on inspection, well over half of these were not verbs at all (tagging errors) and over half of the rest were misspellings of "lose". Only about 15% were actual occurrences of "loose" as a verb. This prompted us to undertake a small investigation into errors in the BNC. We report on some words that occur more often as misspellings than in their own right - only one of the 63 occurrences of "ail", for example, is correct (possibly OCR errors) - and some words that are always mistagged, such as "haulier" and "glazier" (never NN), and "hanker" and "loiter" (never VV). We note in particular that, if a rare word resembles a common word (in spelling), it is more likely to appear as a misspelling of the common word than as a correct spelling of the rare word. These cases require some modification of an earlier conclusion (Damerau and Mays, 1989) on misspellings of rare words. We conclude with a discussion of the desirability, or otherwise, of correcting errors in corpora such as the BNC. The results may be of interest to people who use the BNC as training data or for teaching.
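    The extraction step this abstract describes (collect every token of "loose" tagged VVB or VVI, then inspect each hit by hand) might be sketched roughly as below. This is a minimal illustration and not the authors' procedure: the `(word, tag)` token format, the function names `find_tagged` and `share_by_label`, the toy sentence fragments and the manual labels are all assumptions made for the example, and no real BNC data is used.

```python
from collections import Counter

VERB_TAGS = {"VVB", "VVI"}  # the two verb tags named in the abstract


def find_tagged(tokens, word, tags=VERB_TAGS):
    """Return positions where `word` (case-insensitive) carries one of `tags`."""
    return [i for i, (w, t) in enumerate(tokens)
            if w.lower() == word and t in tags]


def share_by_label(labels):
    """Map each manual label to its share of all labelled hits."""
    counts = Counter(labels)
    total = sum(counts.values())
    return {label: n / total for label, n in counts.items()}


# Invented toy data (hypothetical), not taken from the BNC.
tokens = [
    ("You", "PNP"), ("loose", "VVB"), ("your", "DPS"), ("bonus", "NN1"),
    ("a", "AT0"), ("loose", "AJ0"), ("end", "NN1"),
    ("to", "TO0"), ("loose", "VVI"), ("the", "AT0"), ("dogs", "NN2"),
]

hits = find_tagged(tokens, "loose")

# Hypothetical manual judgements, one per hit, as a human inspector might record.
labels = ["misspelling", "true_verb"]
shares = share_by_label(labels)
```

    The tag filter only reproduces what the tagger claims; as the abstract stresses, the manual inspection step is what separates tagging errors and misspellings from genuine uses.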

    Project communication variables : a comparative study of US and UK industry perceptions

    Research undertaken at the Construction Industry Institute (CII) in the USA has indicated the need for project managers to focus their attention on six ‘Critical Communication Variables’ as a means of ensuring the fulfillment of time, cost and quality targets. These variables refer to the accuracy, timeliness and completeness of information presented to participants, as well as the level of understanding, barriers to and procedures for project-based communication. The findings and tools generated by the CII study have been used as part of case-study-based research examining construction projects in the Central Belt region of Scotland. In addition to the CII data collection tools employed, the Scottish study included semi-structured interviews as a means of contextualising the communication and decision-making taking place. This paper presents the results of this benchmarking exercise, and highlights significant issues that project team members need to improve upon in order to achieve the timeliness, quality and cost required in today’s construction industry.

    Fast, scalable and reliable generation of controlled natural language


    Evolutionary Psychology, Meet Developmental Neurobiology: Against Promiscuous Modularity

    This revised version was published online in July 2006 with corrections to the Cover Date. Evolutionary psychologists claim that the mind contains “hundreds or thousands” of “genetically specified” modules, which are evolutionary adaptations for their cognitive functions. We argue that, while the adult human mind/brain typically contains a degree of modularization, its “modules” are neither genetically specified nor evolutionary adaptations. Rather, they result from the brain's developmental plasticity, which allows environmental task demands a large role in shaping the brain's information-processing structures. The brain's developmental plasticity is our fundamental psychological adaptation, and the “modules” that result from it are adaptive responses to local conditions, not past evolutionary environments. If different individuals share common environments, however, they may develop similar “modules,” and this process can mimic the development of genetically specified modules in the evolutionary psychologist's sense.

    Lexical Parameters, Based on Corpus Analysis of English and Swedish Cancer Data, of Relevance for NLG

    Proceedings of the 16th Nordic Conference of Computational Linguistics NODALIDA-2007. Editors: Joakim Nivre, Heiki-Jaan Kaalep, Kadri Muischnek and Mare Koit. University of Tartu, Tartu, 2007. ISBN 978-9985-4-0513-0 (online) ISBN 978-9985-4-0514-7 (CD-ROM) pp. 333-336

    Towards annotating the plant epigenome: the Arabidopsis thaliana small RNA locus map.

    Based on 98 public and internal small RNA high-throughput sequencing libraries, we mapped small RNAs to the genome of the model organism Arabidopsis thaliana and defined loci based on their expression using an empirical Bayesian approach. The resulting loci were subsequently classified based on their genetic and epigenetic context as well as their expression properties. We present the results of this classification, which broadly conforms to previously reported divisions between transcriptional and post-transcriptional gene silencing small RNAs, and to PolIV and PolV dependencies. However, we are able to demonstrate the existence of further subdivisions in the small RNA population of functional significance. Moreover, we present a framework for similar analyses of small RNA populations in all species.